W14. Shortest Paths: Dijkstra, Bellman-Ford, and Floyd-Warshall
1. Summary
1.1 Shortest-Path Problem Family
1.1.1 Core Problem and Applications
Given a weighted directed graph \(G = (V, E)\) and two vertices \(u, v \in V\), the shortest-path problem asks for a path from \(u\) to \(v\) whose total weight is as small as possible. This is one of the central graph problems because many real systems are naturally modeled as weighted networks: packets move through routers, travelers move through cities, jobs move through dependency pipelines, and compilers move through state spaces when optimizing code.
The shortest-path weight from \(u\) to \(v\) is
\[ \delta(u,v) = \begin{cases} \min\{w(p) : p \text{ is a path from } u \text{ to } v\}, & \text{if such a path exists}, \\ \infty, & \text{otherwise.} \end{cases} \]
A shortest path is any path \(p\) whose weight equals \(\delta(u,v)\). If every edge has weight \(1\), then minimizing total weight is exactly the same as minimizing the number of edges, so \(\delta(u,v)\) becomes the minimum hop count.
1.1.2 Standard Variants
The lecture distinguishes several standard shortest-path variants, and the choice of algorithm depends on which variant is required.
- Single-pair shortest path: compute a shortest path from one source \(u\) to one target \(v\).
- Single-source shortest paths (SSSP): compute shortest paths from one fixed source \(s\) to every vertex.
- Single-destination shortest paths: compute shortest paths from every vertex to one fixed destination \(t\).
- All-pairs shortest paths (APSP): compute shortest paths for every ordered pair \((u,v)\).
The single-destination problem reduces to single-source by reversing every edge: running a single-source algorithm from \(t\) in the reversed graph yields, for every vertex \(v\), the shortest-path distance from \(v\) to \(t\) in the original graph.
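As a minimal sketch (assuming edges are stored as \((u, v, w)\) triples), the reduction is nothing more than reversing each edge before running any single-source algorithm from \(t\):

```python
def reverse_edges(edges):
    """Reverse every directed edge (u, v, w) -> (v, u, w).

    Shortest paths from t in the reversed graph correspond to
    shortest paths to t in the original graph.
    """
    return [(v, u, w) for (u, v, w) in edges]
```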
1.2 Optimal Substructure, Negative Cycles, and Relaxation
1.2.1 Optimal Substructure
Shortest paths have the crucial property of optimal substructure. If
\[ p = \langle v_0, v_1, \dots, v_k \rangle \]
is a shortest path from \(v_0\) to \(v_k\), then every subpath
\[ \langle v_i, v_{i+1}, \dots, v_j \rangle \]
is itself a shortest path from \(v_i\) to \(v_j\). Otherwise, if some subpath were not shortest, it could be replaced by a strictly cheaper path, which would make the whole path \(p\) cheaper as well. That would contradict the assumption that \(p\) is shortest.
This idea is the reason both greedy algorithms and dynamic-programming algorithms work for shortest paths. Dijkstra relies on it locally, while Floyd–Warshall relies on it through a recurrence over restricted intermediate vertices.
1.2.2 Negative Edges and Negative Cycles
Negative edge weights do not by themselves make the shortest-path problem meaningless. A graph may contain negative edges and still have perfectly well-defined shortest paths. The real obstacle is a reachable negative-weight cycle.
If a path from \(s\) to \(v\) can enter a directed cycle whose total weight is negative, then going around that cycle one more time always makes the path cheaper. Repeating the cycle arbitrarily many times drives the total weight downward without bound, so there is no finite minimum. In that case the natural value is not a finite number at all; informally it behaves like \(-\infty\).
So the fundamental distinction is:
- negative edges without reachable negative cycles: shortest paths are still well defined;
- reachable negative cycles: shortest-path weights may fail to exist as finite minima.
1.2.3 Relaxation
The three main algorithms in this lecture are all built around relaxation. In single-source problems, each vertex \(v\) stores:
- a distance estimate \(v.d\), which is an upper bound on the true distance \(\delta(s,v)\);
- a predecessor \(v.\pi\), which records the previous vertex on the current best-known path.
The primitive operation is:
\[ \text{RELAX}(u,v): \quad \text{if } v.d > u.d + w(u,v), \text{ then set } v.d \leftarrow u.d + w(u,v), \; v.\pi \leftarrow u. \]
Each relaxation either improves an estimate or leaves it unchanged. The algorithms differ mainly in the order in which they perform these relaxations:
- Dijkstra chooses the next vertex greedily;
- Bellman–Ford repeatedly relaxes every edge;
- Floyd–Warshall relaxes all ordered pairs through progressively allowed intermediate vertices.
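The RELAX primitive itself can be sketched in a few lines (a minimal version, assuming distances are kept in a dict `d` and predecessors in a dict `pi`):

```python
def relax(u, v, w, d, pi):
    """RELAX(u, v): try to improve v's estimate through edge (u, v)."""
    if d[u] + w < d[v]:
        d[v] = d[u] + w     # tighter upper bound on delta(s, v)
        pi[v] = u           # remember the path that produced it
        return True         # the estimate improved
    return False            # nothing changed
```

Each algorithm below is just a policy for the order in which this update is applied.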
1.3 Dijkstra’s Algorithm
1.3.1 Greedy Idea and Relation to Prim
Dijkstra’s algorithm solves the single-source shortest-path problem when all edge weights are nonnegative. It resembles Prim’s minimum-spanning-tree algorithm very closely:
- both algorithms maintain a priority queue of vertices;
- both repeatedly extract one vertex with minimum key;
- both update neighboring vertices through decrease-key operations.
The difference lies in the meaning of the key:
- in Prim, the key of \(v\) is the cheapest edge connecting \(v\) to the current tree;
- in Dijkstra, the key of \(v\) is the current best known source-to-\(v\) distance estimate.
The greedy invariant is stronger in Dijkstra: once a vertex \(u\) is extracted from the queue, its estimate is final:
\[ u.d = \delta(s,u). \]
This is correct only because all edge weights are nonnegative. Nonnegativity guarantees that any alternative path reaching \(u\) later cannot be cheaper than the path that already produced the minimum extracted estimate.
1.3.2 Pseudocode Structure
The CLRS-style structure is:
- Initialize all vertices with distance \(\infty\) and predecessor NIL, except the source with distance \(0\).
- Insert all vertices into a min-priority queue keyed by \(v.d\).
- Repeatedly extract the vertex with minimum estimate.
- Relax all outgoing edges of that extracted vertex.
After termination, the predecessor pointers form a shortest-path tree rooted at the source, provided all weights are nonnegative.
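A compact Python sketch of this structure. Python's `heapq` has no DECREASE-KEY, so this version pushes duplicate entries and skips stale ones on extraction, which preserves the \(O(E \log V)\) bound:

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest paths for nonnegative weights.

    adj: dict mapping each vertex to a list of (neighbor, weight) pairs.
    Uses 'lazy deletion' instead of DECREASE-KEY: outdated heap
    entries are skipped when popped.
    """
    d = {v: float('inf') for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    pq = [(0, s)]
    while pq:
        du, u = heapq.heappop(pq)
        if du > d[u]:          # stale entry: u was already finalized
            continue
        for v, w in adj[u]:    # relax all outgoing edges of u
            if du + w < d[v]:
                d[v], pi[v] = du + w, u
                heapq.heappush(pq, (d[v], v))
    return d, pi
```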
1.3.3 Running Time
With adjacency lists and a binary heap:
- EXTRACT-MIN is performed \(|V|\) times at \(O(\log V)\) each;
- DECREASE-KEY is performed at most \(|E|\) times at \(O(\log V)\) each;
- scanning adjacency lists contributes \(O(E)\) total.
Therefore the total time is
\[ O(E \log V). \]
With Fibonacci heaps:
- EXTRACT-MIN costs \(O(\log V)\) amortized;
- DECREASE-KEY costs \(O(1)\) amortized;
so the total time becomes
\[ O(E + V \log V). \]
The lecture also emphasizes the \(d\)-ary heap bound
\[ O\big((dV + E)\log_d V\big). \]
For dense graphs with \(E = \Theta(V^2)\), this becomes \(\Theta(V^2 \log_d V)\).
1.3.4 Why Dijkstra Fails on Negative Edges
The correctness proof of Dijkstra depends on the claim that once a vertex is extracted, it can never be improved later. A negative edge destroys exactly this claim. A path discovered afterward may enter a not-yet-processed region of the graph and then return through a negative edge, creating a smaller value for a vertex that Dijkstra already finalized.
This is why the problem is not merely a technicality: with negative weights, the algorithm’s core greedy step is no longer safe.
1.4 Bellman–Ford Algorithm
1.4.1 Fundamental Observation
Any simple path in a graph with \(|V|\) vertices uses at most \(|V| - 1\) edges. If shortest paths are well defined, then each shortest path can be taken simple; otherwise a repeated vertex would create a cycle that could be removed without increasing the path weight.
This means that to compute all shortest paths from a source, it is enough to ensure correctness for paths using at most \(|V| - 1\) edges.
1.4.2 Algorithm and Negative-Cycle Detection
Bellman–Ford implements this observation directly:
- initialize all estimates;
- repeat \(|V| - 1\) times: relax every edge in the graph;
- perform one extra pass over all edges;
- if any estimate still decreases, report a reachable negative cycle.
The logic is clean. After one full pass, all shortest paths using at most one edge are correct. After two passes, all shortest paths using at most two edges are correct. After \(k\) passes, all shortest paths using at most \(k\) edges are correct. Hence after \(|V| - 1\) passes, all simple shortest paths have been covered.
The extra pass is not for computing distances. It is a certificate check: if some edge still relaxes, then some path using at least \(|V|\) edges is improving an estimate, and that can only happen because a reachable negative cycle exists.
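The passes and the certificate check can be sketched as follows (a minimal version, assuming an edge-list representation):

```python
def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, w) triples. Returns (d, pi, has_neg_cycle)."""
    d = {v: float('inf') for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):   # |V| - 1 relaxation passes
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v], pi[v] = d[u] + w, u
    # extra pass: any further improvement certifies a reachable negative cycle
    neg_cycle = any(d[u] + w < d[v] for u, v, w in edges)
    return d, pi, neg_cycle
```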
1.4.3 Complexity and Use Cases
Each pass examines all edges once, which costs \(\Theta(E)\). Since there are \(\Theta(V)\) passes, the total running time is
\[ O(VE). \]
This is asymptotically slower than Dijkstra on nonnegative graphs, but Bellman–Ford is the correct tool when:
- negative edge weights may occur;
- reachable negative cycles must be detected explicitly;
- a simple, uniform edge-relaxation procedure is preferable to a greedy queue discipline.
1.5 Properties of Shortest-Path Estimates
The lecture isolates several properties that appear in correctness proofs. These are worth studying because they explain why relaxation-based algorithms converge.
1.5.1 Triangle Inequality and Upper Bounds
For any edge \((u,v)\),
\[ \delta(s,v) \le \delta(s,u) + w(u,v). \]
This is the triangle inequality for shortest paths: going from \(s\) to \(v\) directly by a shortest path cannot be worse than first going to \(u\) and then taking the edge \((u,v)\).
At the same time, relaxation algorithms preserve the upper-bound property:
\[ v.d \ge \delta(s,v) \]
at all times. Estimates may be too large, but they are never too small.
1.5.2 No-Path, Convergence, and Path Relaxation
If no path from \(s\) to \(v\) exists, then both the true distance and the estimate remain \(\infty\). This is the no-path property.
The convergence property says that if \((u,v)\) lies on a shortest path and \(u.d\) has already become exact before we relax \((u,v)\), then afterward \(v.d\) also becomes exact. This follows from the triangle inequality plus the update rule.
The path-relaxation property generalizes this. If the edges of a shortest path are relaxed in the path’s order, one after another, then the destination estimate eventually becomes exact. Bellman–Ford exploits this over repeated full passes, and Dijkstra exploits it through the fact that nonnegative edges let exactness spread outward in increasing-distance order.
1.6 Floyd–Warshall Algorithm
1.6.1 All-Pairs Perspective
The all-pairs shortest-path problem asks for \(\delta(u,v)\) for every ordered pair \((u,v)\). For this task, a matrix-based view is natural. Floyd–Warshall stores a distance matrix and improves it by dynamic programming.
Unlike Dijkstra and Bellman–Ford, which are source-centered, Floyd–Warshall is pair-centered. It asks how the best route from \(i\) to \(j\) changes as more and more vertices are allowed as intermediate stops.
1.6.2 Dynamic-Programming State
Number the vertices as \(1, 2, \dots, n\). Define
\[ d_{ij}^{(k)} \]
to be the weight of the shortest path from \(i\) to \(j\) whose internal vertices are all chosen from \(\{1,2,\dots,k\}\).
This is the right subproblem family because shortest paths are naturally built from intermediate vertices, not from a fixed number of edges. A shortest path may have many edges, but what matters for the recurrence is whether it is allowed to pass through the pivot vertex \(k\).
1.6.3 Recurrence and Loop Order
For each stage \(k\),
\[ d_{ij}^{(k)} = \min\left(d_{ij}^{(k-1)}, \; d_{ik}^{(k-1)} + d_{kj}^{(k-1)}\right). \]
This recurrence has a direct interpretation:
- either the best path from \(i\) to \(j\) does not use vertex \(k\) as an intermediate vertex, so the old value \(d_{ij}^{(k-1)}\) remains best;
- or the best path does use \(k\), in which case it splits into a shortest path from \(i\) to \(k\) and a shortest path from \(k\) to \(j\), both using only vertices up to \(k-1\) internally.
The outermost loop must be over \(k\). If \(i\) or \(j\) were outermost instead, then some entries would be updated using values from the wrong stage, breaking the dynamic-programming dependency structure.
1.6.4 Complexity, In-Place Updates, and Negative Cycles
The algorithm performs three nested loops over the vertices, so its running time is
\[ \Theta(V^3). \]
The distance matrix uses
\[ \Theta(V^2) \]
space, and storing predecessors requires another \(\Theta(V^2)\) if path reconstruction is needed.
The lecture also notes two important implementation facts:
- the algorithm can be implemented in place, overwriting the distance matrix entry by entry, but this requires careful reasoning about dependencies;
- if the final matrix has some diagonal entry \(d_{ii} < 0\), then the graph contains a negative-weight cycle through vertex \(i\).
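Both facts can be seen in a short sketch that follows the loop order and the diagonal check described above (with `float('inf')` for absent edges):

```python
def floyd_warshall(w):
    """w: n x n matrix of edge weights, float('inf') for missing edges,
    0 on the diagonal. k must be the outermost loop."""
    n = len(w)
    d = [row[:] for row in w]           # work on a copy; updates are in place
    for k in range(n):                  # pivot: newly allowed intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # a negative diagonal entry certifies a negative-weight cycle
    has_neg_cycle = any(d[i][i] < 0 for i in range(n))
    return d, has_neg_cycle
```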
1.7 Choosing the Right Algorithm
For the three algorithms in this lecture, the main selection rule is:
- use Dijkstra for single-source shortest paths when all edge weights are nonnegative;
- use Bellman–Ford for single-source shortest paths when negative edges may appear or when reachable negative cycles must be detected;
- use Floyd–Warshall when a full all-pairs distance matrix is required.
All three are built from the same conceptual core — relaxation and optimal substructure — but they organize the computation in very different ways.
2. Definitions
- Shortest-path weight \(\delta(u,v)\): The minimum total weight of a path from \(u\) to \(v\), or \(\infty\) if no such path exists.
- Shortest path: A path whose total weight equals the shortest-path weight between its endpoints.
- Single-source shortest paths (SSSP): The problem of computing distances from one source vertex to all vertices.
- All-pairs shortest paths (APSP): The problem of computing distances for all ordered pairs of vertices.
- Distance estimate \(v.d\): The algorithm’s current upper bound on the true shortest-path distance from the source to \(v\).
- Predecessor \(v.\pi\): The previous vertex on the current best known path to \(v\).
- Relaxation: The update step that replaces \(v.d\) by \(u.d + w(u,v)\) when that value is smaller.
- Optimal substructure: The property that every subpath of a shortest path is itself shortest.
- Negative-weight edge: An edge with weight less than \(0\).
- Negative-weight cycle: A directed cycle whose total edge weight is negative.
- Triangle inequality: For every edge \((u,v)\), \(\delta(s,v) \le \delta(s,u) + w(u,v)\).
- Upper-bound property: During relaxation algorithms, estimates always satisfy \(v.d \ge \delta(s,v)\).
- No-path property: If \(v\) is unreachable from the source, then \(v.d\) remains \(\infty\).
- Convergence property: If \(u.d\) is exact and \((u,v)\) lies on a shortest path, then relaxing \((u,v)\) makes \(v.d\) exact.
- Path-relaxation property: Relaxing the edges of a shortest path in order eventually makes the destination estimate exact.
- Dijkstra’s algorithm: A greedy single-source shortest-path algorithm correct for nonnegative edge weights.
- Bellman–Ford algorithm: A single-source shortest-path algorithm that handles negative edges and detects reachable negative cycles.
- Floyd–Warshall algorithm: A dynamic-programming algorithm for all-pairs shortest paths based on allowed intermediate vertices.
3. Formulas
- Shortest-path definition: \(\delta(u,v)=\min\{w(p): p \text{ is a path from } u \text{ to } v\}\)
- Relaxation update: If \(v.d > u.d + w(u,v)\), then \(v.d \leftarrow u.d + w(u,v)\) and \(v.\pi \leftarrow u\)
- Triangle inequality: \(\delta(s,v) \le \delta(s,u) + w(u,v)\)
- Dijkstra with binary heap: \(O(E \log V)\)
- Dijkstra with Fibonacci heap: \(O(E + V \log V)\)
- Dijkstra with \(d\)-ary heap: \(O((dV + E)\log_d V)\)
- Bellman–Ford running time: \(O(VE)\)
- Floyd–Warshall recurrence: \(d_{ij}^{(k)}=\min\left(d_{ij}^{(k-1)}, d_{ik}^{(k-1)} + d_{kj}^{(k-1)}\right)\)
- Floyd–Warshall running time: \(\Theta(V^3)\)
- Floyd–Warshall space usage: \(\Theta(V^2)\)
4. Examples
4.1. Run Dijkstra from Vertex A (Lecture 12, Task 1)
Run Dijkstra’s algorithm on the directed graph with edges
\[ A \to B(3), \; A \to D(1), \; B \to C(2), \; B \to E(5), \; B \to D(6), \; C \to F(6), \; C \to E(2), \; D \to E(1), \; D \to B(8), \; E \to F(3), \]
starting from vertex \(A\).
Key Concept: Dijkstra repeatedly finalizes the vertex with the smallest current estimate. Because all edge weights are nonnegative, once a vertex is extracted its estimate is final.
Initialize:
\[ d(A)=0, \qquad d(B)=d(C)=d(D)=d(E)=d(F)=\infty. \]
We also set every predecessor to NIL.
| Step | Extracted vertex | Successful relaxations | Distances after the step |
|---|---|---|---|
| 0 | — | initialize only | \(A=0,\; B=\infty,\; C=\infty,\; D=\infty,\; E=\infty,\; F=\infty\) |
| 1 | \(A\) | \(B \leftarrow 3\) via \(A\), \(D \leftarrow 1\) via \(A\) | \(A=0,\; B=3,\; C=\infty,\; D=1,\; E=\infty,\; F=\infty\) |
| 2 | \(D\) | \(E \leftarrow 2\) via \(D\) | \(A=0,\; B=3,\; C=\infty,\; D=1,\; E=2,\; F=\infty\) |
| 3 | \(E\) | \(F \leftarrow 5\) via \(E\) | \(A=0,\; B=3,\; C=\infty,\; D=1,\; E=2,\; F=5\) |
| 4 | \(B\) | \(C \leftarrow 5\) via \(B\) | \(A=0,\; B=3,\; C=5,\; D=1,\; E=2,\; F=5\) |
| 5 | \(C\) | none | \(A=0,\; B=3,\; C=5,\; D=1,\; E=2,\; F=5\) |
| 6 | \(F\) | none | final |
Now check the nontrivial relaxations explicitly.
- From \(A\): \[d(B) = 0 + 3 = 3, \qquad d(D) = 0 + 1 = 1.\]
- From \(D\): \[d(E) = d(D) + w(D,E) = 1 + 1 = 2.\] The edge \(D \to B\) gives \(1 + 8 = 9\), which does not improve \(d(B)=3\).
- From \(E\): \[d(F) = d(E) + w(E,F) = 2 + 3 = 5.\]
- From \(B\): \[d(C) = d(B) + w(B,C) = 3 + 2 = 5.\] The candidate value for \(E\) is \(3 + 5 = 8\), which is worse than \(2\).
- From \(C\): \[d(F) \le 5 + 6 = 11 \quad \text{and} \quad d(E) \le 5 + 2 = 7,\] so nothing improves.
The final shortest-path distances are:
\[ d(A)=0,\quad d(B)=3,\quad d(C)=5,\quad d(D)=1,\quad d(E)=2,\quad d(F)=5. \]
The predecessor pointers describe the shortest-path tree:
- \(\pi(D)=A\)
- \(\pi(B)=A\)
- \(\pi(E)=D\)
- \(\pi(C)=B\)
- \(\pi(F)=E\)
So one set of shortest paths is:
- \(A \leadsto D\)
- \(A \leadsto B\)
- \(A \leadsto D \leadsto E\)
- \(A \leadsto B \leadsto C\)
- \(A \leadsto D \leadsto E \leadsto F\)
Answer: The final distances are \(d(A)=0\), \(d(B)=3\), \(d(C)=5\), \(d(D)=1\), \(d(E)=2\), and \(d(F)=5\), with predecessor tree edges \(A\to D\), \(A\to B\), \(D\to E\), \(B\to C\), and \(E\to F\).
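The trace can be cross-checked with a short script (a sketch using `heapq` with lazy deletion; same edge list as above):

```python
import heapq

edges = {"A": [("B", 3), ("D", 1)], "B": [("C", 2), ("E", 5), ("D", 6)],
         "C": [("F", 6), ("E", 2)], "D": [("E", 1), ("B", 8)],
         "E": [("F", 3)], "F": []}

d = {v: float('inf') for v in edges}
d["A"] = 0
pq = [(0, "A")]
while pq:
    du, u = heapq.heappop(pq)
    if du > d[u]:          # stale heap entry, already finalized
        continue
    for v, w in edges[u]:  # relax outgoing edges
        if du + w < d[v]:
            d[v] = du + w
            heapq.heappush(pq, (d[v], v))

print(d)  # {'A': 0, 'B': 3, 'C': 5, 'D': 1, 'E': 2, 'F': 5}
```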
4.2. Explain Why Dijkstra Fails with Negative Edge \(C \to D\) (Lecture 12, Task 2)
Run Dijkstra’s algorithm on the lecture graph with a negative edge and explain why the algorithm can produce an incorrect result.
Key Concept: Dijkstra assumes that once a vertex leaves the priority queue, its estimate can never improve again. Negative edges break exactly that assumption.
Suppose a vertex \(D\) is extracted with some estimate \(d(D)\). Later the algorithm may discover a path
\[ s \leadsto C \to D \]
where the edge \(C \to D\) has negative weight. Even if the prefix \(s \leadsto C\) is found only afterward, the total
\[ d(C) + w(C,D) \]
may become smaller than the supposedly final value of \(d(D)\).
At that point Dijkstra has no repair mechanism, because extracted vertices are never returned to the queue. The algorithm therefore locks in a value that may later turn out to be too large.
So the failure is not accidental. It is structural:
- Dijkstra finalizes vertices permanently.
- Negative edges allow later improvements to earlier vertices.
- Therefore the greedy invariant is false.
This is why Dijkstra is valid only when all edge weights are nonnegative.
Answer: Dijkstra fails because a later path through the negative edge \(C \to D\) can improve a vertex that was already finalized, violating the algorithm’s core greedy invariant.
4.3. Trace Dijkstra from Vertex L on the Large Lecture Graph (Lecture 12, Task 3)
Run Dijkstra’s algorithm on the large graph from the lecture slide, starting from vertex \(L\).
Key Concept: On a large graph, Dijkstra is still the same repeated routine: extract the smallest tentative distance, relax all outgoing edges, and record each predecessor update.
The slide image does not list every edge textually, so the most reliable self-study solution is to state the exact procedure and apply it to the original diagram.
- Initialize: \[d(L)=0,\] and every other distance is \(\infty\).
- Insert all vertices into a min-priority queue keyed by their tentative distance.
- Repeatedly:
- extract the vertex \(u\) with minimum tentative distance;
- mark \(u\) as finalized;
- for every outgoing edge \((u,v)\), perform relaxation.
- Continue until the queue becomes empty.
While tracing the algorithm on paper, maintain a table with:
- current queue minimum,
- current distances,
- predecessor changes,
- the set of finalized vertices.
Three correctness checks help you catch mistakes:
- Extracted distances must be nondecreasing.
- Every predecessor update must satisfy \[d(v) = d(\pi(v)) + w(\pi(v),v).\]
- The predecessor pointers at the end must form a tree rooted at \(L\).
So, even though the OCR transcript does not preserve the full edge list, the solving method is completely determined: it is the standard Dijkstra trace used in Task 4.1, only on a larger graph.
Answer: Use the same Dijkstra trace as in Task 4.1, starting from \(L\); the exact numeric trace depends on the original slide graph, whose full edge list is not preserved in the transcript.
4.4. Run Bellman-Ford from Source \(E\) (Lecture 12, Task 4)
Run Bellman–Ford on the 5-vertex lecture graph with source \(E\), using the edge set
\[ A \to B(6), \; A \to D(7), \; B \to C(5), \; B \to D(8), \; B \to E(-4), \; C \to B(-2), \; D \to C(-3), \; D \to E(9), \; E \to A(2), \; E \to C(7). \]
Key Concept: After pass \(k\), all shortest paths using at most \(k\) edges are correct.
Initialize:
\[ d(E)=0, \qquad d(A)=d(B)=d(C)=d(D)=\infty. \]
We now perform \(|V|-1=4\) full passes over all edges.
| Pass | Distance estimates after the pass |
|---|---|
| 0 | \(A=\infty,\; B=\infty,\; C=\infty,\; D=\infty,\; E=0\) |
| 1 | \(A=2,\; B=\infty,\; C=7,\; D=\infty,\; E=0\) |
| 2 | \(A=2,\; B=5,\; C=6,\; D=9,\; E=0\) |
| 3 | \(A=2,\; B=4,\; C=6,\; D=9,\; E=0\) |
| 4 | \(A=2,\; B=4,\; C=6,\; D=9,\; E=0\) |
Now justify the important updates.
Pass 1
Only the outgoing edges of \(E\) can help, because all other vertices still have estimate \(\infty\):
\[ d(A) = 0 + 2 = 2, \qquad d(C) = 0 + 7 = 7. \]
Pass 2
Using the new value of \(A\):
\[ d(B) \le d(A) + 6 = 8, \qquad d(D) \le d(A) + 7 = 9. \]
When the edge \(C \to B\) is scanned, \(d(C)\) still holds its pass-1 value \(7\), so
\[ d(B) \le d(C) - 2 = 7 - 2 = 5, \]
which replaces the candidate \(8\). Afterward the edge \(D \to C\) uses the new value of \(D\):
\[ d(C) \le d(D) - 3 = 9 - 3 = 6, \]
which improves the old value \(7\). The further improvement
\[ d(B) \le d(C) - 2 = 6 - 2 = 4 \]
appears only in the next pass, because \(C \to B\) was scanned before \(d(C)\) dropped to \(6\): Bellman–Ford propagates information edge by edge according to the scan order.
Pass 3
The value of \(B\) becomes exact:
\[ d(B)=4. \]
No further edge can improve any estimate.
So the final shortest-path distances from \(E\) are:
\[ d(A)=2,\quad d(B)=4,\quad d(C)=6,\quad d(D)=9,\quad d(E)=0. \]
In the extra negative-cycle check pass, no edge relaxes further, so there is no reachable negative cycle in this graph.
Answer: The final Bellman-Ford distances from \(E\) are \(d(A)=2\), \(d(B)=4\), \(d(C)=6\), \(d(D)=9\), and \(d(E)=0\), and the graph has no reachable negative cycle.
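The whole run can be reproduced in a few lines (a sketch using the same edge list and scan order as above):

```python
V = ["A", "B", "C", "D", "E"]
E = [("A", "B", 6), ("A", "D", 7), ("B", "C", 5), ("B", "D", 8), ("B", "E", -4),
     ("C", "B", -2), ("D", "C", -3), ("D", "E", 9), ("E", "A", 2), ("E", "C", 7)]

d = {v: float('inf') for v in V}
d["E"] = 0
for _ in range(len(V) - 1):            # |V| - 1 = 4 passes
    for u, v, w in E:
        if d[u] + w < d[v]:
            d[v] = d[u] + w

# extra pass: nothing relaxes, so no reachable negative cycle
assert not any(d[u] + w < d[v] for u, v, w in E)
print(d)  # {'A': 2, 'B': 4, 'C': 6, 'D': 9, 'E': 0}
```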
4.5. Detect a Negative Cycle with Bellman-Ford (Lecture 12, Task 5)
Use Bellman–Ford to explain why the lecture’s second 5-vertex graph contains a reachable negative cycle.
Key Concept: If Bellman-Ford can still improve some estimate after \(|V|-1\) passes, then some reachable walk uses a repeated vertex, and the repeated cycle must have negative total weight.
Bellman–Ford detects a reachable negative cycle by performing one extra full pass after the usual \(|V|-1\) passes.
The lecture trace shows exactly the warning sign we are looking for: the distance estimates continue decreasing from one iteration to the next instead of stabilizing. In particular, the handwritten distance values keep improving across later passes, which means the algorithm is repeatedly finding cheaper walks.
That behavior has only one explanation:
- a walk is using more than \(|V|-1\) edges,
- yet it is still improving the estimate,
- therefore that walk must repeat a vertex,
- and the repeated cycle must have negative total weight.
So the graph contains a reachable negative-weight cycle. For every vertex reachable from that cycle, the shortest-path value is not a finite minimum, because looping around the cycle one more time always produces a cheaper path.
Answer: The graph contains a reachable negative-weight cycle, because the Bellman-Ford estimates keep decreasing even after the normal \(|V|-1\) passes instead of stabilizing.
4.6. Prove Why \(|V|-1\) Passes Suffice (Lecture 12, Task 6)
Let \(G\) be directed and assume no negative-weight cycle is reachable from the source \(s\). Explain why \(|V|-1\) outer passes over all edges suffice for Bellman–Ford.
Key Concept: In the absence of reachable negative cycles, every shortest path can be chosen simple.
- A simple path in a graph with \(|V|\) vertices uses at most \(|V|-1\) edges.
- After one Bellman–Ford pass, all shortest paths using at most one edge are correctly represented.
- After two passes, all shortest paths using at most two edges are correctly represented.
- By induction, after \(k\) passes, all shortest paths using at most \(k\) edges are correctly represented.
Now take any vertex \(v\) reachable from \(s\). Since no reachable negative cycle exists, there is a shortest path from \(s\) to \(v\) that is simple. Therefore it uses at most \(|V|-1\) edges. After \(|V|-1\) passes, Bellman–Ford has propagated exactness along all edges of that path, so \(d(v)=\delta(s,v)\).
Thus \(|V|-1\) passes are sufficient for all vertices.
Answer: \(|V|-1\) passes suffice because every shortest path can be taken simple, and a simple path in a graph with \(|V|\) vertices uses at most \(|V|-1\) edges.
4.7. Modify Bellman-Ford for the Bound \(m < |V|-1\) (Lecture 12, Task 7)
Suppose every shortest path from \(s\) uses at most \(m\) edges, where \(m < |V|-1\). Describe how to modify Bellman–Ford so that it performs at most \(m+1\) outer passes.
Key Concept: If every shortest path uses at most \(m\) edges, Bellman-Ford only needs \(m\) ordinary relaxation passes, plus one optional final pass for negative-cycle detection.
If every shortest path uses at most \(m\) edges, then Bellman–Ford does not need to spend \(|V|-1\) full passes propagating information farther than \(m\) edges from the source.
The modification is:
- perform at most \(m\) ordinary passes over all edges;
- keep a Boolean variable `changed`;
- during each pass, set `changed = true` if some relaxation succeeds;
- if a pass ends with `changed = false`, stop early because all estimates have already converged;
- after the ordinary passes, do one additional pass for negative-cycle detection.
This gives at most \(m+1\) passes total.
Why is it correct? Because after pass \(k\), Bellman–Ford is correct for all shortest paths using at most \(k\) edges. By assumption, every shortest path of interest uses at most \(m\) edges, so after \(m\) passes all shortest distances are already correct. The extra pass is kept only to test whether some reachable negative cycle still allows an improvement.
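A sketch of the modified procedure (the helper name `bellman_ford_bounded` is hypothetical; `m` is the assumed bound on shortest-path edge counts):

```python
def bellman_ford_bounded(vertices, edges, s, m):
    """Bellman-Ford when every shortest path from s uses at most m edges.

    At most m ordinary passes (with early stopping), plus one extra
    pass that only checks for a reachable negative cycle: m + 1 total.
    """
    d = {v: float('inf') for v in vertices}
    d[s] = 0
    for _ in range(m):
        changed = False
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                changed = True
        if not changed:          # all estimates converged early
            break
    neg_cycle = any(d[u] + w < d[v] for u, v, w in edges)
    return d, neg_cycle
```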
Answer: Replace the usual \(|V|-1\) outer passes by at most \(m\) ordinary passes with early stopping, then keep one extra pass for negative-cycle detection, for a total of at most \(m+1\) passes.
4.8. Run Floyd-Warshall on the 5-Vertex Graph (Lecture 12, Task 8)
Run Floyd–Warshall on the 5-vertex lecture graph. Use the vertex order \(A, B, C, D, E\).
Key Concept: Floyd-Warshall gradually enlarges the set of allowed intermediate vertices; after stage \(k\), the matrix stores shortest paths whose internal vertices come only from the first \(k\) vertices in the chosen order.
We start from the initial weight matrix \(D^{(0)}\), where diagonal entries are \(0\), edge weights are written explicitly, and absent edges are \(\infty\).
Initial matrix
| A | B | C | D | E | |
|---|---|---|---|---|---|
| A | 0 | 6 | \(\infty\) | 7 | \(\infty\) |
| B | \(\infty\) | 0 | 5 | 8 | -4 |
| C | \(\infty\) | -2 | 0 | \(\infty\) | \(\infty\) |
| D | \(\infty\) | \(\infty\) | -3 | 0 | 9 |
| E | 2 | \(\infty\) | 7 | \(\infty\) | 0 |
Now process vertices one by one as allowed intermediate vertices.
After allowing \(A\) as an intermediate vertex: \(D^{(1)}\)
| A | B | C | D | E | |
|---|---|---|---|---|---|
| A | 0 | 6 | \(\infty\) | 7 | \(\infty\) |
| B | \(\infty\) | 0 | 5 | 8 | -4 |
| C | \(\infty\) | -2 | 0 | \(\infty\) | \(\infty\) |
| D | \(\infty\) | \(\infty\) | -3 | 0 | 9 |
| E | 2 | 8 | 7 | 9 | 0 |
Only the paths \(E \to A \to B\) and \(E \to A \to D\) improve values:
\[ 8 = 2 + 6, \qquad 9 = 2 + 7. \]
After allowing \(B\) as an intermediate vertex: \(D^{(2)}\)
| A | B | C | D | E | |
|---|---|---|---|---|---|
| A | 0 | 6 | 11 | 7 | 2 |
| B | \(\infty\) | 0 | 5 | 8 | -4 |
| C | \(\infty\) | -2 | 0 | 6 | -6 |
| D | \(\infty\) | \(\infty\) | -3 | 0 | 9 |
| E | 2 | 8 | 7 | 9 | 0 |
The main updates are:
\[ A \to C = 6 + 5 = 11, \] \[ A \to E = 6 + (-4) = 2, \] \[ C \to D = -2 + 8 = 6, \] \[ C \to E = -2 + (-4) = -6. \]
After allowing \(C\) as an intermediate vertex: \(D^{(3)}\)
| A | B | C | D | E | |
|---|---|---|---|---|---|
| A | 0 | 6 | 11 | 7 | 2 |
| B | \(\infty\) | 0 | 5 | 8 | -4 |
| C | \(\infty\) | -2 | 0 | 6 | -6 |
| D | \(\infty\) | -5 | -3 | 0 | -9 |
| E | 2 | 5 | 7 | 9 | 0 |
Important updates:
\[ D \to B = -3 + (-2) = -5, \] \[ D \to E = -3 + (-6) = -9, \] \[ E \to B = 7 + (-2) = 5. \]
After allowing \(D\) as an intermediate vertex: \(D^{(4)}\)
| A | B | C | D | E | |
|---|---|---|---|---|---|
| A | 0 | 2 | 4 | 7 | -2 |
| B | \(\infty\) | 0 | 5 | 8 | -4 |
| C | \(\infty\) | -2 | 0 | 6 | -6 |
| D | \(\infty\) | -5 | -3 | 0 | -9 |
| E | 2 | 4 | 6 | 9 | 0 |
Important updates:
\[ A \to B = 7 + (-5) = 2, \] \[ A \to C = 7 + (-3) = 4, \] \[ A \to E = 7 + (-9) = -2, \] \[ E \to B = 9 + (-5) = 4, \] \[ E \to C = 9 + (-3) = 6. \]
After allowing \(E\) as an intermediate vertex: \(D^{(5)}\)
| A | B | C | D | E | |
|---|---|---|---|---|---|
| A | 0 | 2 | 4 | 7 | -2 |
| B | -2 | 0 | 2 | 5 | -4 |
| C | -4 | -2 | 0 | 3 | -6 |
| D | -7 | -5 | -3 | 0 | -9 |
| E | 2 | 4 | 6 | 9 | 0 |
These last improvements come from routes that pass through \(E\), for example:
\[ B \to A = B \to E \to A = -4 + 2 = -2, \] \[ C \to A = C \to E \to A = -6 + 2 = -4, \] \[ D \to A = D \to E \to A = -9 + 2 = -7. \]
The final all-pairs distance matrix is therefore
\[ D^{(5)} = \begin{bmatrix} 0 & 2 & 4 & 7 & -2 \\ -2 & 0 & 2 & 5 & -4 \\ -4 & -2 & 0 & 3 & -6 \\ -7 & -5 & -3 & 0 & -9 \\ 2 & 4 & 6 & 9 & 0 \end{bmatrix}. \]
All diagonal entries are \(0\), not negative, so this graph contains no negative-weight cycle.
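The staged tables above can be reproduced with a short script; the in-place triple loop with \(k\) outermost yields the same final matrix:

```python
INF = float('inf')
#        A    B    C    D    E
W = [[  0,   6, INF,   7, INF],   # A
     [INF,   0,   5,   8,  -4],   # B
     [INF,  -2,   0, INF, INF],   # C
     [INF, INF,  -3,   0,   9],   # D
     [  2, INF,   7, INF,   0]]   # E

D = [row[:] for row in W]
for k in range(5):                 # pivots in order A, B, C, D, E
    for i in range(5):
        for j in range(5):
            if D[i][k] + D[k][j] < D[i][j]:
                D[i][j] = D[i][k] + D[k][j]

assert all(D[i][i] == 0 for i in range(5))   # no negative cycle
print(D)                                     # rows match D^(5) above
```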
Answer: The final all-pairs distance matrix is \[ \begin{bmatrix} 0 & 2 & 4 & 7 & -2 \\ -2 & 0 & 2 & 5 & -4 \\ -4 & -2 & 0 & 3 & -6 \\ -7 & -5 & -3 & 0 & -9 \\ 2 & 4 & 6 & 9 & 0 \end{bmatrix}, \] and since all diagonal entries remain \(0\), the graph has no negative-weight cycle.